Remarks 

Claims 1-18 are pending in this application and stand rejected. Claims 1, 7 and 13 are 
amended herein. 

Response to Rejection Under 35 USC §101 

Claims 1-12 are rejected under 35 USC § 101 as not falling within one of the four statutory
categories of invention. This rejection is overcome in view of the amended claims.

Claims 1-6 were rejected for not specifying "any physical components for carrying out 
each of the steps described." See Examiner's Answer, page 3. Claim 1 has been amended to 
associate the method steps with a physical component. Specifically, amended claim 1 now 
recites "an image database," "a motion analysis block executed by a processor" and "a look-ahead
detector executed by the processor." Support for this amendment is found throughout the
specification, for example at paragraphs [0025]-[0029] and FIGS. 1 and 2. Accordingly, claims 1-6 now
specify a physical component for carrying out the method steps and now recite statutory subject 
matter. Hence, reconsideration and withdrawal of their rejection is respectfully requested. 

Claims 7-12 were rejected for not specifying physical components for carrying out the 
described operations. See Examiner's Answer, page 4. Claim 7 has been amended to now recite 
"a processor" and "a computer readable storage medium." Thus, claim 7 now recites physical 
components for carrying out the described operations, meeting the statutory requirements of 35 
USC § 101. Support for this amendment is found throughout the specification, for example at
[0028]-[0030] and FIG. 2. Therefore, claims 7-12 now meet the requirements of 35 USC § 101, 
so reconsideration and withdrawal of their rejection is respectfully requested. 

Response to Rejection Under 35 USC §103(a)

The Examiner rejected claims 1-18 under 35 USC § 103(a) as unpatentable over PCT
Application No. US97/08266 to Chang et al. ("Chang") in view of U.S. Patent No. 6,670,963 to
Osberger ("Osberger"). This rejection is respectfully overcome in view of the amended claims.

As amended, representative claim 1 recites: 

A method of detecting at least one of a pan and a zoom in a video 
sequence, comprising: 

selecting a set of frames from a video sequence; 

determining a set of motion vectors for each frame in the set of frames;

determining a motion angle for each motion vector;

identifying at least two largest regions in each frame using a look-ahead
detector executed by the processor, wherein the first largest region
includes motion vectors with substantially similar motion angles and
occupies a largest number of pixels in a frame and the second largest
region includes motion vectors with substantially similar motion angles
and occupies a second largest number of pixels in a frame;

determining percentages of each frame covered by each of the at least two
largest regions using the look-ahead detector;

determining a statistical measure of the motion angles for at least one of 
the two largest regions; and 

comparing the percentages and statistical measure to threshold values to
identify at least one of a pan and a zoom in the video sequence.

The claimed method detects the presence of a pan or a zoom in a video sequence.
Initially, a set of frames is selected from the video sequence and a set of motion vectors is
determined for each frame in the set. A motion angle describing motion vector orientation is
then determined for each motion vector. At least two largest regions in each frame are then
identified, the first largest region including motion vectors with substantially similar motion
angles and occupying a largest number of pixels in a frame and the second largest region
including motion vectors with substantially similar motion angles and occupying a second
largest number of pixels in a frame. The percentage of each frame covered by each of the at
least two largest regions is then determined. A statistical measure of the motion angles for at
least one of the identified two largest regions is computed and compared to threshold values to
identify a pan or a zoom. Support for the amendments to the independent claims is found
throughout the specification, for example at [0033]-[0036].

Thus, the claimed method detects a pan or a zoom by identifying the two largest regions 
of each frame in a video sequence having substantially similar motion vector orientation and 
occupying the largest and second largest number of pixels in a frame. Motion angles are 
computed for each motion vector and the motion angles are used to identify the regions that have 
substantially similar motion vector orientation. By identifying the two largest regions of each 
frame having a substantially similar motion vector orientation and occupying the largest number 
of pixels and second largest number of pixels in a frame, the claimed method allows for pan or 
zoom detection without computing global motion parameters (i.e., computing motion where most 
of the image points are uniformly displaced). Further, determining motion angles for each 
motion vector allows for rapid identification of frame regions having substantially similar motion 
vector orientation by evaluating the similarity of the motion angles. Determining a statistical 
measure for one of the largest frame regions, rather than the entire frame, reduces the 
computation necessary to detect a pan or a zoom in the frame, beneficially improving the 
efficiency of pan or zoom detection. 
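
For illustration only, the steps summarized above can be sketched as follows. This is a minimal sketch and not the implementation described in the specification. It assumes each frame is divided into equally sized blocks with one motion vector per block, so that a block count stands in for a pixel count. It also assumes that "substantially similar" motion angles can be approximated by grouping angles into fixed-width bins. The function names, the bin width, the threshold values, the use of variance as the statistical measure and the final decision rule are all hypothetical.

```python
import math
from statistics import pvariance


def motion_angle(dx, dy):
    """Orientation of a motion vector in degrees, normalized to [0, 360)."""
    return math.degrees(math.atan2(dy, dx)) % 360.0


def two_largest_regions(vectors, bin_width=15.0):
    """Group blocks whose motion angles fall in the same bin and return the
    two most populous groups (a hypothetical stand-in for region identification).

    vectors: list of (dx, dy) pairs, one per equally sized block of the frame.
    """
    groups = {}
    for dx, dy in vectors:
        if dx == 0 and dy == 0:
            continue  # assumption: a zero vector has no meaningful angle
        angle = motion_angle(dx, dy)
        groups.setdefault(int(angle // bin_width), []).append(angle)
    return sorted(groups.values(), key=len, reverse=True)[:2]


def frame_measurements(vectors):
    """Return (percentages covered by the largest regions, angle variance of
    the largest region). Percentages are fractions of all blocks in the frame."""
    regions = two_largest_regions(vectors)
    if not regions:
        return [], None
    percentages = [len(region) / len(vectors) for region in regions]
    return percentages, pvariance(regions[0])


def looks_like_pan(vectors, coverage_threshold=0.6, variance_threshold=25.0):
    """Hypothetical decision rule: summed coverage is high and the angles in
    the largest region are tightly clustered."""
    percentages, angle_variance = frame_measurements(vectors)
    if not percentages:
        return False
    return sum(percentages) >= coverage_threshold and angle_variance <= variance_threshold
```

In this sketch, only the angles of the largest group are passed to the statistical measure, which mirrors the point above that the measure is computed for one of the largest regions rather than for the entire frame.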

As amended, representative claim 1 recites, in part, "identifying at least two largest
regions in each frame using a look-ahead detector executed by the processor, wherein the first
largest region includes motion vectors with substantially similar motion angles and occupies a
largest number of pixels in a frame and the second largest region includes motion vectors with
substantially similar motion angles and occupies a second largest number of pixels in a frame"
and "determining percentages of each frame covered by the at least two largest regions."
Independent claims 7 and 13, as amended, recite similar elements. 

Chang identifies areas of a frame having motion vectors different than non-moving areas 
of the frame (Chang, page 17, lines 4-6). To identify the moving portions of the frame, Chang 
compares motion vectors to a predetermined threshold value, eliminating areas of the frame with 
motion vectors less than the predetermined threshold value (Chang, page 17, lines 7-11). The 
identified portions of the frame with motion vectors exceeding the predetermined threshold value 
are analyzed using a linear transformation and a translation to more particularly identify moving 
and non-moving regions of a frame. However, this determination accounts for the magnitude of 
the motion vectors associated with a portion of the frame and is unrelated to the number of pixels 
occupied by the identified regions. In contrast, the claimed invention identifies "at least two 
largest regions in each frame" where the "first largest region includes motion vectors with 
substantially similar motion angles and occupies a largest number of pixels in a frame and the 
second largest region includes motion vectors with substantially similar motion angles and 
occupies a second largest number of pixels in a frame" to identify the largest areas within a 
frame having substantially similar motion vectors. Chang does not disclose, or even suggest,
identifying the regions that occupy the largest and second largest number of pixels in a frame
and include motion vectors with substantially similar motion angles.

At most, Chang compares a number of contiguous blocks associated with a motion vector 
to a threshold value to improve detection accuracy (Chang, page 17, lines 10-13). However, this 
comparison merely prevents false detection of small objects, and does not identify "at least two 
largest regions in each frame" where the "first largest region includes motion vectors with 
substantially similar motion angles and occupies a largest number of pixels in a frame and the
second largest region includes motion vectors with substantially similar motion angles and
occupies a second largest number of pixels in a frame," as claimed. Rather than identify regions 
having substantially similar motion angles and occupying the largest or second largest number of 
pixels in a frame, Chang identifies all regions in a frame having any motion vector exceeding the 
threshold value and associated with more than a minimum number of contiguous blocks. This 
merely places a minimum on the size of the regions identified, so any region exceeding this 
minimum value is identified, rather than the regions occupying the largest number of pixels and 
the second largest number of pixels in a frame. 
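
The distinction drawn above can be illustrated with a short sketch. It is a simplified illustration of the argument, not code taken from Chang or from the specification. The region labels, the sizes and the function names are hypothetical.

```python
def regions_above_minimum(region_sizes, min_blocks):
    """Minimum-size filtering as characterized above: every region meeting
    the minimum is kept, however many regions that turns out to be."""
    return [name for name, size in region_sizes.items() if size >= min_blocks]


def two_largest(region_sizes):
    """Selection by rank: only the largest and second largest regions are kept."""
    return sorted(region_sizes, key=region_sizes.get, reverse=True)[:2]


# Hypothetical region sizes, in blocks, for a single frame.
sizes = {"A": 40, "B": 25, "C": 12, "D": 3}
kept_by_minimum = regions_above_minimum(sizes, min_blocks=10)  # A, B and C
kept_by_rank = two_largest(sizes)                              # A and B only
```

With these hypothetical sizes, the minimum-size filter keeps three regions, while selection by rank keeps exactly two.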

Further, as Chang does not identify "at least two largest regions in each frame having 
motion vectors with substantially similar motion angles," Chang cannot determine "percentages 
of each frame covered by each of the at least two largest regions," as recited in the independent 
claims. Because Chang detects all regions in a frame including motion vectors exceeding the 
predetermined threshold value, there is no determination of the "percentages of each frame 
covered by each of the at least two largest regions," but merely an identification of all portions of 
the frame satisfying minimum requirements. Further, the Examiner admits that Chang does not 
explicitly disclose determining percentages of each frame covered by the at least two largest 
regions. See Final Office Action dated May 2, 2008, page 4. 

Osberger fails to remedy the deficient disclosure of Chang. Rather, Osberger discloses a 
segmentation algorithm dividing a video frame into a plurality of regions based on color and 
luminance (Osberger, Abstract). The segmentation algorithm processes both a current frame and 
a previous frame to produce motion vectors for the current frame (Osberger, 2:33-37). However, 
Osberger examines motion vectors associated with the complete current frame and the complete 
previous frame to generate an importance map for the current frame (Osberger, 3:23-30). There
is no disclosure or suggestion in Osberger of "identifying at least two largest regions in each
frame using a look-ahead detector executed by the processor, wherein the first largest region 
includes motion vectors with substantially similar motion angles and occupies a largest number 
of pixels in a frame and the second largest region includes motion vectors with substantially 
similar motion angles and occupies a second largest number of pixels in a frame" and 
"determining percentages of each frame covered by the at least two largest regions," as claimed. 

Osberger merely examines a specified percentile of the camera motion compensated 
vector map to estimate motion in a current frame. For example, Osberger examines the 98th
percentile of the camera motion compensated vector map and disregards the remaining 2 percent 
of the motion vectors to determine the amount of overall motion in a complete frame vector map 
(Osberger, 7:61-64). This estimation does not identify "at least two largest regions in each frame 
using a look-ahead detector executed by the processor, wherein the first largest region includes 
motion vectors with substantially similar motion angles and occupies a largest number of pixels 
in a frame and the second largest region includes motion vectors with substantially similar 
motion angles and occupies a second largest number of pixels in a frame" or "determine 
percentages of each frame covered by the at least two largest regions," but merely discounts a 
portion of the motion vectors in a complete frame during analysis. Like Chang, Osberger does 
not identify specific regions within a frame having substantially similar motion vectors and 
occupying "a largest number of pixels in a frame" and "a second largest number of pixels in a 
frame," but analyzes a frame in its entirety. As Osberger does not identify different regions 
within the analyzed frame, Osberger also does not determine "percentages of each frame covered 
by each of the at least two largest regions." The percentile of the compensated motion vector
map analyzed in Osberger is not associated with any region of the frame, much less the two
largest regions of the frame, but is a general map of all motion in a complete frame. 

Hence, the motion analysis disclosed in Osberger examines individual frames in their 
entirety and does not identify "at least two largest regions in each frame using a look-ahead 
detector executed by the processor, wherein the first largest region includes motion vectors with 
substantially similar motion angles and occupies a largest number of pixels in a frame and the 
second largest region includes motion vectors with substantially similar motion angles and 
occupies a second largest number of pixels in a frame" or determine "percentages of each frame 
covered by the at least two largest regions," as claimed. 

While Osberger produces motion vectors for a current frame by processing the current 
frame and a previous frame, the claimed invention identifies a first largest region in a frame 
including motion vectors with substantially similar motion angles and a second largest region 
including motion vectors with substantially similar motion angles and occupying a
second largest number of pixels in a frame. The percentage of the frame covered by each of the
identified largest regions is computed and analyzed to identify a pan or zoom. Rather than
determine the percentage of a frame covered by specific regions, Osberger takes "the mth
percentile, such as the 98th percentile, of the camera motion compensated motion vector map" to
estimate the amount of motion in a scene using a subset of the camera motion compensated 
motion vector map (Osberger, col. 7, lines 58-64). This motion vector map describes all motion 
in a complete frame, without identifying distinct regions within the frame, much less regions 
within the frame having substantially similar motion angles and occupying the largest number of 
pixels or the second largest number of pixels in the frame (Osberger, col. 8, lines 10-34). The 
analysis in Osberger of the motion vector map does not determine the percentage of the frame
individually covered by each of the identified largest regions, but merely eliminates a percentile
of the motion vectors from the frame as a whole. There is no correlation disclosed in Osberger 
between the percentile of the compensated motion vector map and individual regions within the 
frame or the size of specific regions within the compensated motion vector map. 

Nothing in Osberger indicates that the disclosed motion vector map or percentile analysis 
identifies "at least two largest regions in each frame using a look-ahead detector executed by the 
processor, wherein the first largest region includes motion vectors with substantially similar 
motion angles and occupies a largest number of pixels in a frame and the second largest region 
includes motion vectors with substantially similar motion angles and occupies a second largest 
number of pixels in a frame" or determines the percentage of a frame covered by each of the
two largest regions of the frame having motion vectors with substantially similar motion angles. 
In contrast to the global analysis of the motion vector map for a complete frame disclosed in 
Osberger, the claimed invention identifies specific regions within a frame and analyzes the 
identified regions to determine the presence of a pan or zoom. As part of this analysis, the 
claimed invention determines the percentage of the frame covered by the two largest regions of 
the frame having substantially similar motion angles. Merely analyzing a specified percentile of 
a motion vector map does not determine the percentages of a frame covered by the largest
regions in the frame having motion vectors with substantially similar motion angles, but analyzes 
a subset of the motion vectors included in a complete frame, regardless of the location of the 
motion vectors within the frame. Osberger allows for evaluation of the difference in overall 
motion between two complete frames using a percentile of motion in the frames, but does not 
determine "percentages of each frame covered by each of the at least two largest regions" having 
substantially similar motion vectors, as claimed. The determination of an overall percentile of all
motion in a frame does not determine the percentage of a frame covered by specific regions
within the frame. 
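
The difference between a percentile of motion values and a percentage of frame coverage can be shown with a brief sketch. It is offered for illustration only and is not taken from Osberger or from the specification. The nearest-rank percentile method, the function names and the sample values are assumptions.

```python
import math


def percentile_nearest_rank(values, pct):
    """Value below which roughly pct percent of the motion magnitudes fall.
    The result is a motion magnitude and carries no location information."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def coverage_percentage(region_block_count, total_blocks):
    """Share of the frame occupied by one identified region, in percent."""
    return 100.0 * region_block_count / total_blocks


# Hypothetical data: 100 motion magnitudes and one region of 30 blocks
# in a frame of 120 blocks.
magnitudes = list(range(1, 101))
overall_motion = percentile_nearest_rank(magnitudes, 98)
region_share = coverage_percentage(30, 120)
```

The first function returns a single motion magnitude for the frame as a whole. The second returns the share of the frame occupied by a specific region, which cannot be computed until that region has been identified.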

During prosecution of the current application, references have been made to hypothetical 
cases of frames with moving backgrounds and frames with non-moving backgrounds to support 
application of Osberger and Chang to the claims. See Final Office Action dated May 2, 2008, 
page 2; Examiner's Answer, pages 8-9. Although a frame having a single moving foreground 
object and a static background object would have two regions having different motion vectors, 
the statements in the Final Office Action dated May 2, 2008, and the Examiner's Answer
overlook claim elements when discussing this hypothetical example, and neither Osberger nor
Chang explicitly discloses the hypothetical scenario presented. Furthermore, even if the
combination of Chang and Osberger allows differentiation between a static background and one 
or more moving objects in the foreground, this identification does not identify "at least two 
largest regions in each frame using a look-ahead detector executed by the processor, wherein the 
first largest region includes motion vectors with substantially similar motion angles and occupies 
a largest number of pixels in a frame and the second largest region includes motion vectors with 
substantially similar motion angles and occupies a second largest number of pixels in a frame" or 
determine "percentages of each frame covered by each of the at least two largest regions," but 
identifies any region in a frame having different motion vectors than the frame's background. 

Differentiation between a moving object and a static background does not determine the
percentage of each frame covered by the moving object or the static background, but merely 
indicates a difference in motion vectors between one or more moving objects and the 
background. This identification of moving objects is independent of the pixels occupied by the 
moving object or the background, but is based on differences in the motion vectors of different
portions of the frame. By determining the at least two largest regions in each frame and
determining the percentage of the frame covered by each of the at least two largest regions, the 
claimed invention reduces the computation necessary to detect a pan or a zoom in a video 
sequence by using the at least two largest regions to represent motion within the frame. The 
disclosures of Chang and Osberger do not address the percentage of a frame covered by any 
identified moving objects, but analyze movement in the frame as a whole based on differences 
between motion vectors. 

Additionally, in the Examiner's Answer, it is alleged that "the Applicant's own 
specification shows that the percentages of the largest regions are summed, and the sum is 
compared to a threshold... just as described in Osberger." Examiner's Answer, page 10. 
However, the independent claims recite "determining percentages of each frame covered by each 
of the at least two largest regions," which is not disclosed in Osberger or Chang. The subsequent 
use of these percentages is not relevant to whether or not the cited references disclose this 
claimed element. Although the specification provides an example embodiment of the claimed 
invention where the determined percentages are summed, this is not recited in the independent 
claims, which specifically include the element of "determining percentages of each frame 
covered by the at least two largest regions." This claim element is not disclosed in Chang or 
Osberger. 

Thus, the cited references, taken alone or in combination, do not disclose or teach the 
claimed invention. Therefore, claim 1 is patentable over the cited references and withdrawal of 
the rejection is respectfully requested. 

As amended, independent claims 7 and 13 similarly recite "identifying at least two largest 
regions in each frame using a look-ahead detector executed by the processor, wherein the first
largest region includes motion vectors with substantially similar motion angles and occupies a
largest number of pixels in a frame and the second largest region includes motion vectors with 
substantially similar motion angles and occupies a second largest number of pixels in a frame" 
and "determining percentages of each frame covered by the at least two largest regions." 
Therefore, amended claims 7 and 13 are patentable over the cited references, both alone and in 
combination, for at least the same reasons discussed above with respect to claim 1.

In addition to reciting their own patentable features, claims 2-6, 8-12 and 14-18 variously 
depend from patentable base claims 1, 7 and 13. Accordingly, each of dependent claims 2-6,
8-12 and 14-18 is also patentable.

The Examiner also rejected claim 16 as unpatentable over Chang and Osberger in view of
Official Notice. However, the Official Notice relied upon by the Examiner does not overcome 
the deficiencies of Chang and Osberger. The Official Notice merely indicates that polar 
coordinates are a form of mathematical representation. However, this Official Notice does not 
disclose "identifying at least two largest regions in each frame using a look-ahead detector 
executed by the processor, wherein the first largest region includes motion vectors with 
substantially similar motion angles and occupies a largest number of pixels in a frame and the 
second largest region includes motion vectors with substantially similar motion angles and 
occupies a second largest number of pixels in a frame" and "determining percentages of each 
frame covered by the at least two largest regions," as claimed. Therefore, the combination of 
Chang, Osberger and Official Notice fails to disclose the subject matter of claim 16. 

Thus, claim 16 is patentably distinguishable over the cited references, both alone and in 
combination, and withdrawal of the rejection is respectfully requested.

Conclusion



Should the Examiner wish to discuss the above remarks, or if the Examiner believes that 
for any reason direct contact with Applicants' representative would help to advance the 
prosecution of this case to allowance, the Examiner is invited to telephone the undersigned at the 
number given below. 



Respectfully submitted, 

Adriana Dumitras and Barin G. Haskell 



Dated: May 31, 2009 By: /Brian G. Brannon/

Brian G. Brannon, Reg. No. 57,219 

FENWICK & WEST LLP 

801 California Street 

Mountain View, CA 94041 

Tel: (650) 335-7610

Fax: (650) 938-5200 

bbrannon@fenwick.com